Fast Rounding in Multiprecision . . .

Authors

  • J. H. McClellan
  • C. M. Rader
Abstract

to Section III, the factorized algorithm computes (14) and (15) with 36 multiplications and 90 additions. Other symmetries of (14) and (15) can be exploited: the first terms of the sums in (14) and (15), together with certain pairs of the terms d_i and d_i', are pairwise equal; therefore, only 18 multiplications and 48 additions in GF(2) are needed. Without taking into account the time for input/output operations, a completely parallel realization of the Massey-Omura multiplier (with elementary processors capable of the binary GF(2) operations between two operands) multiplies in five clock pulses; the factorized algorithm multiplies in six (but with a greater communication complexity).

1) An important part of the algorithmic principle presented in Section II is the factorization along the chain of fields (10). This principle can be applied to algorithms different from the one considered in Section III. For example, the algorithm presented in [20] is intrinsically sequential, and factoring it along the chain (10) gives a much more parallel procedure.

2) Consider the basic algorithm between F_{s-i+1} and F_{s-i+2} in (12). The coefficients u_k^((j+1)) in (7) are fixed elements of F_{s-i+2}, so they induce a linear transformation into F_{s-i+2} which can be computed in no more than (m_{s-i} ... m_1)^2 operations in F. With this modification, in the case m = 2^s (Example 1), the factorized algorithm requires no more than m^2(1 + s/4) multiplications in F.

3) Algorithms based on the bilinear representation of Section II allow a highly parallel implementation which computes a product in a finite field GF(2^n) in time log2 n. This is true also for the factorized algorithm of Section III with the modification of Remark 2. It is worthwhile to notice that multiplication algorithms derived from efficient multiplication and division algorithms for polynomials (such as the FFT and the Schönhage-Strassen algorithms [1], [10], [14], [15]) do not allow a parallel implementation running in time linear in log2 n.

4) The basis of E over F resulting from the algorithm factorization exhibits a "nested structure" as in (16). It is worthwhile to notice that, in general, it is not a normal basis for F_{s-i+1} over F [13].

5) The matrix A^(5) of the bilinear representation (2) associated with the basis (16) in Example …
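Remark 1's chain-of-fields principle can be illustrated with a small self-contained sketch (not the paper's factorized algorithm; the field representations and irreducible polynomials below are illustrative choices): GF(16) is built as a degree-2 extension of GF(4), itself a degree-2 extension of GF(2), so one GF(16) product reduces to a few GF(4) products, each of which reduces to binary GF(2) operations.

```python
# Sketch: multiplication in a tower GF(2) < GF(4) < GF(16).
# GF(4) = GF(2)[x]/(x^2 + x + 1); elements are pairs (a0, a1) meaning a0 + a1*x.
# GF(16) = GF(4)[y]/(y^2 + y + w) with w = x (an element of GF(4) with trace 1,
# so the polynomial is irreducible over GF(4)); elements are pairs of GF(4) elements.

def gf4_add(a, b):
    return (a[0] ^ b[0], a[1] ^ b[1])              # addition is bitwise XOR

def gf4_mul(a, b):
    a0, a1 = a
    b0, b1 = b
    # (a0 + a1*x)(b0 + b1*x) reduced with x^2 = x + 1
    return ((a0 & b0) ^ (a1 & b1),
            (a0 & b1) ^ (a1 & b0) ^ (a1 & b1))

W = (0, 1)  # w = x in GF(4)

def gf16_add(c, d):
    return (gf4_add(c[0], d[0]), gf4_add(c[1], d[1]))

def gf16_mul(c, d):
    c0, c1 = c
    d0, d1 = d
    # (c0 + c1*y)(d0 + d1*y) reduced with y^2 = y + w
    p00 = gf4_mul(c0, d0)
    p11 = gf4_mul(c1, d1)
    cross = gf4_add(gf4_mul(c0, d1), gf4_mul(c1, d0))
    return (gf4_add(p00, gf4_mul(p11, W)), gf4_add(cross, p11))

if __name__ == "__main__":
    one = ((1, 0), (0, 0))
    g = ((0, 1), (1, 0))          # an arbitrary nonzero element of GF(16)
    p = one
    for _ in range(15):
        p = gf16_mul(p, g)
    print(p == one)               # every nonzero element satisfies g^15 = 1: True
```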


Similar articles

Efficient multiprecision floating point multiplication with optimal directional rounding

An algorithm is described for multiplying multiprecision floating point numbers. The algorithm produces either the smallest floating point number greater than or equal to the true product or the greatest floating point number smaller than or equal to the true product. Software implementations of multiprecision floating point multiplication can reduce the computing time by a factor of two if ...
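The snippet is truncated, but the directional-rounding idea can be sketched as follows; this is an illustrative reconstruction in Python, not the paper's algorithm, and the (mantissa, exponent) representation and PRECISION constant are assumptions of the sketch.

```python
# Sketch: directed rounding of a multiprecision floating point product.
# A value is (mantissa, exponent) meaning mantissa * 2**exponent, with a Python int mantissa.
PRECISION = 53  # target mantissa width in bits (illustrative)

def round_product(a, b, direction):
    """Multiply a and b exactly, then round to PRECISION bits toward
    +infinity (direction = +1) or -infinity (direction = -1)."""
    m = a[0] * b[0]                 # exact integer product of mantissas
    e = a[1] + b[1]
    if m == 0:
        return (0, 0)
    excess = max(0, abs(m).bit_length() - PRECISION)
    if excess == 0:
        return (m, e)
    q, r = divmod(m, 1 << excess)   # floor division: q is the round-toward--inf mantissa
    if r != 0 and direction > 0:
        q += 1                      # round up only if nonzero bits were discarded
    # (a real implementation would renormalize if q grew to PRECISION + 1 bits)
    return (q, e + excess)

def to_float(x):
    return x[0] * 2.0 ** x[1]

if __name__ == "__main__":
    a = (3 << 60 | 1, -60)          # slightly above 3
    b = (7 << 60 | 5, -60)          # slightly above 7
    lo = round_product(a, b, -1)    # greatest representable value <= true product
    hi = round_product(a, b, +1)    # smallest representable value >= true product
    print(to_float(lo), to_float(hi))
```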


Double-least-significant-bits 2's-complement number representation scheme with bitwise complementation and symmetric range

A scheme is proposed for representing 2’s-complement binary numbers in which there are two least-significant bits (LSBs). Benefits of the extra LSB include making the number representation range symmetric (i.e. from -2^(k-1) to 2^(k-1) for k-bit integers), allowing sign change by simple bitwise logical inversion, facilitating multiprecision arithmetic and enabling the truncation of results in lieu of round...
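A minimal sketch of the double-LSB idea, assuming a k-bit main word in ordinary 2's complement plus one extra bit of weight 1 (details of the published scheme may differ): negation is obtained by complementing every bit, extra LSB included, and the representable range is symmetric about zero.

```python
# Sketch: double-LSB 2's-complement numbers.
# A number is (main, extra): 'main' is a K-bit 2's-complement word and 'extra'
# is a second least-significant bit of weight 1, so value = twoscomp(main) + extra.
K = 8  # width of the main word (illustrative)

def value(main, extra):
    v = main if main < (1 << (K - 1)) else main - (1 << K)  # decode 2's complement
    return v + extra

def negate(main, extra):
    # Sign change by bitwise complementation of all bits, extra LSB included:
    # -(v + e) = (~v + 1) - e = ~v + (1 - e)
    return (main ^ ((1 << K) - 1), extra ^ 1)

if __name__ == "__main__":
    ok = True
    for main in range(1 << K):
        for extra in (0, 1):
            ok &= value(*negate(main, extra)) == -value(main, extra)
    print(ok)                                  # expected: True
    print(-(1 << (K - 1)), (1 << (K - 1)))     # symmetric range: min and max values
```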


Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates

Exact computer arithmetic has a variety of uses including, but not limited to, the robust implementation of geometric algorithms. This report has three purposes. The first is to offer fast software-level algorithms for exact addition and multiplication of arbitrary precision floating-point values. The second is to propose a technique for adaptive-precision arithmetic that can often speed these ...
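The exact-addition primitive underlying this style of arithmetic is the standard TwoSum error-free transformation, sketched below; the report's full expansion arithmetic builds on primitives like this, but the code here is only a generic illustration.

```python
# Sketch: TwoSum error-free transformation (Knuth/Moller), a building block of
# exact and adaptive-precision floating-point arithmetic.
from fractions import Fraction

def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and s + err == a + b exactly."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    b_round = b - b_virtual
    a_round = a - a_virtual
    return s, a_round + b_round

if __name__ == "__main__":
    a, b = 1.0, 1e-17               # b would be lost entirely in a naive sum
    s, err = two_sum(a, b)
    print(s, err)                   # s == 1.0, err recovers the lost low-order part
    # verify exactness with rational arithmetic
    print(Fraction(s) + Fraction(err) == Fraction(a) + Fraction(b))  # expected: True
```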


On fast matrix-vector multiplication with a Hankel matrix in multiprecision arithmetics

We present two fast algorithms for matrix-vector multiplication y = Ax, where A is a Hankel matrix. The asymptotically fastest method currently known is based on the Fast Fourier Transform (FFT); however, in multiprecision arithmetic with very high accuracy the FFT method is actually slower than schoolbook multiplication for matrix sizes up to n = 8000. One method presented is based on a decomposition of m...
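For comparison, the schoolbook Hankel matrix-vector product the snippet mentions can be written directly from the 2n-1 defining entries; the sketch below is generic and does not reproduce the paper's fast decomposition-based methods.

```python
# Sketch: schoolbook y = A x for an n-by-n Hankel matrix A with A[i][j] = h[i + j],
# where h holds the 2n - 1 defining entries (first column followed by the rest of
# the last row). O(n^2) scalar operations, each of which may be an exact
# multiprecision/rational multiply-add.
from fractions import Fraction

def hankel_matvec(h, x):
    n = len(x)
    assert len(h) == 2 * n - 1
    return [sum(h[i + j] * x[j] for j in range(n)) for i in range(n)]

if __name__ == "__main__":
    n = 3
    h = [Fraction(1, k) for k in range(1, 2 * n)]   # a small Hilbert-like Hankel matrix
    x = [Fraction(1), Fraction(2), Fraction(3)]
    print(hankel_matvec(h, x))
```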


A fast and flexible software library for large integer arithmetic

An ANSI C library of subroutines for multiprecision operations on unsigned integers is presented that is both fast and flexible. Usability and applicability of such a library are shown to depend on the basic design decisions as well as on the library's actual functionality. Basic design decisions include the choice of programming language, the representation of multiprecision integers, error ha...
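To make the design decisions concrete (choice of radix, representation of multiprecision integers), here is a toy limb-based representation with schoolbook addition; it is written in Python rather than the library's ANSI C and is purely illustrative, not the library's API.

```python
# Toy sketch: an unsigned multiprecision integer as a little-endian list of
# 32-bit limbs, plus schoolbook addition with carry propagation.
RADIX_BITS = 32
RADIX = 1 << RADIX_BITS

def from_int(n):
    limbs = []
    while True:
        limbs.append(n & (RADIX - 1))
        n >>= RADIX_BITS
        if n == 0:
            return limbs

def to_int(limbs):
    return sum(limb << (RADIX_BITS * i) for i, limb in enumerate(limbs))

def mp_add(a, b):
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        result.append(s & (RADIX - 1))
        carry = s >> RADIX_BITS
    if carry:
        result.append(carry)
    return result

if __name__ == "__main__":
    x, y = 2**130 + 12345, 2**131 + 67890
    print(to_int(mp_add(from_int(x), from_int(y))) == x + y)  # expected: True
```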


Toom-Cook Multiplication: Some Theoretical and Practical Aspects

Toom-Cook multiprecision multiplication is a well-known multiprecision multiplication method, which can make use of multiprocessor systems. In this paper the Toom-Cook complexity is derived, some explicit proofs of the Toom-Cook interpolation method are given, the even-odd method for interpolation is explained, and certain aspects of a 32-bit C++ and assembler implementation, which is in develo...
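The simplest member of the Toom-Cook family, Toom-2 (Karatsuba), already shows the evaluate-multiply-interpolate pattern; the sketch below is a generic illustration, not the paper's 32-bit C++/assembler implementation, and higher Toom-k variants need more evaluation points and a genuine interpolation step.

```python
# Sketch: Toom-2 (Karatsuba) multiplication of non-negative integers.
# Split each operand in two halves, do 3 half-size multiplications instead of 4,
# and recombine; Toom-k generalizes this with k-way splits and more evaluation points.
def toom2_mul(a, b):
    if a < 2**32 or b < 2**32:                  # small operands: built-in multiply
        return a * b
    half = max(a.bit_length(), b.bit_length()) // 2
    a1, a0 = a >> half, a & ((1 << half) - 1)   # a = a1*2^half + a0
    b1, b0 = b >> half, b & ((1 << half) - 1)
    low = toom2_mul(a0, b0)                     # evaluation at 0
    high = toom2_mul(a1, b1)                    # evaluation at "infinity"
    mid = toom2_mul(a0 + a1, b0 + b1) - low - high   # evaluation at 1, then interpolate
    return (high << (2 * half)) + (mid << half) + low

if __name__ == "__main__":
    import random
    x, y = random.getrandbits(2000), random.getrandbits(2000)
    print(toom2_mul(x, y) == x * y)             # expected: True
```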




Publication date: 1989